Annual Report 2003: USAMRMC: Grant# DAMD17-01-0160: P.I.: Sam Thiagalingam, Ph.D.
Metastatic  progression  of  breast  cancer  by  allelic  loss  on  chromosome  18q21. 

ANNUAL  REPORT  OF  THE  USAMRMC  FUNDED  ACTIVITY 
Title  of  the  grant:  Metastatic  progression  of  breast  cancer  by  allelic  loss  on  chromosome  18q21. 

1.  Introduction/  Project  Overview/  Scientific  Progress  and  future  directions: 

The  majority  of  molecular  genetic  studies  on  breast  cancer  have  focused  on  familial  predisposition 
and  there  has  been  a  lack  of  serious  effort  to  understand  the  molecular  basis  of  the  involvement  of 
genetic  determinants  in  the  progression  to  metastatic  cancer.  The  fact  that  18q  loss  has  been 
predominantly  associated  with  the  advanced  carcinoma  stage  of  cancers  suggests  that  the  genes 
inactivated  by  this  specific  alteration  or  other  genes  in  the  pathway  targeted  for  the  inactivation  could 
be  associated  with  the  conversion  of  benign  tumors  to  malignancy  and  metastatic  progression  of  breast 
cancer. However, unlike in pancreatic, colon, lung and ovarian cancers, the Smad2 and Smad4 genes localized to
chromosome 18q are not mutated in breast cancer, which strongly supports the existence of
alternate target genes in breast cancer.

Disabling  Smad  signaling  in  cancer  has  become  increasingly  recognized  as  an  important  step  that 
affects  processes  such  as  loss  of  growth  inhibition,  promotion  of  angiogenesis  and  metastasis  and  the 
epithelial  mesenchymal  transition  (1,2).  Our  survey  of  the  various  Smad  genes  has  provided  the  first 
clues  in  identifying  the  Smad8  gene  as  an  important  target  for  loss  of  expression  in  nearly  30%  of 
breast cancers, which we believe is a significant finding, as even the most celebrated tumor marker,
HER2/neu gene amplification, occurs in about 20%-30% of breast cancer cases (3). We report here that
we  have  extended  these  initial  observations  to  demonstrate  that  the  inactivation  of  the  Smad8  gene 
leading  to  loss  of  its  expression  is  mediated  by  epigenetic  DNA  methylation  (4).  It  still  remains  to  be 
determined  whether  Smad8  inactivation  could  also  be  an  alternate  target  for  Smad2  or  Smad4 
inactivation. 

Furthermore,  our  investigation  of  the  potential  role  of  Smad4  inactivation  revealed  that  the  gene 
expression  pattern  in  cell  culture  models  that  lack  Smad4  could  favor  angiogenesis/  metastasis,  which 
is further enhanced by TGFβ and hypoxia. We are continuing to characterize these cell culture models
to identify the mediator and effector genes that regulate metastatic progression of breast cancer upon
inactivation  of  the  Smad4  signaling  pathway. 

2. Modified tasks that were approved following the first annual report, their expansion to set
specific goals, summary of findings and future directions:

We have further expanded Tasks 1 & 2 to set specific goals incorporating our recent findings
that  are  aimed  at  increasing  the  understanding  of  the  implications  of  the  role  of  chromosome  18q  loss  in 
the  molecular  basis  of  metastatic  breast  cancer. 

Task  1.  Determination  and  identification  of  genetic  and  epigenetic  alterations  in  known  and 
novel  Smads  as  potential  target  genes  and  the  elucidation  of  their  implications  to  metastatic 
breast  cancer. 

We used a novel technique, targeted expressed gene display (TEGD), to
identify loss of Smad8 gene expression as the major Smad gene alteration in
breast cancer. We also demonstrate that epigenetic silencing of Smad8 by DNA
hypermethylation directly correlates with loss of Smad8 expression. We are in the process of
cloning wild-type and defective Smad8 genes to carry out add-back experiments to
further understand the role of Smad8 inactivation in cancer. We also plan to determine
whether Smad8 inactivation could serve as an alternate target in place of the inactivation of Smad2 or Smad4
or the deregulation of Smad7.

Task 2. Identification and elucidation of the roles of alternate target genes involved in the
Smad4  signaling  pathway. 

We  have  made  progress  in  establishing  isogenic  cell  culture  model  systems  that  are 
proficient and deficient in Smad4 expression. Our preliminary data strongly suggest that the
Smad4 defect could be a critical contributor to a gene expression pattern that favors
angiogenesis/metastasis. Interestingly, these events are highly favored by TGFβ and hypoxia,
consistent with the conditions that promote advanced cancer. Further characterization and
identification of mediator and effector genes that promote angiogenesis/metastasis under these
conditions, and the identification of potential cofactors that could interact with Smad4 and hence
could be alternate target(s) for inactivation in breast cancer, are in progress.

Task  3.  Evaluation  of  candidate  target  genes. 

This  task  remains  unmodified  and  would  begin  once  we  have  identified  legitimate  target  genes 
in  Tasks  1  &  2. 

We  have  made  substantial  progress  towards  not  only  the  identification  of  the  major  Smad  gene 
target  ( Smad8 )  but  also  the  molecular  basis  of  its  loss  of  function  in  breast  cancer.  We  have  also 
established  the  model  systems  and  conditions  that  should  aid  us  in  the  discovery  of  alternate  targets  in 
Smad4 signaling as well as the effector/mediator genes that are involved in the genesis/progression of
metastatic  breast  cancer.  We  believe  that  these  studies  could  provide  important  insights  into  the 
molecular  basis  of  breast  cancer  metastasis  leading  to  better  diagnosis,  prognosis  and  therapy  of  the 
disease. 

3.  Body:  Procedures  and  progress  report: 

Figure 1. Targeted expressed gene display (TEGD).

A. Schematic representation of TEGD for the Smad family of genes. MH1 and MH2 indicate highly homologous regions in the amino acid as well as DNA sequence among the various Smad gene family members. The
forward and reverse primers for PCR amplification of the cDNA were designed in the conserved regions as indicated. The radiolabeled PCR products
were analyzed by denaturing acrylamide gel electrophoresis. B. PCR products for SMADs using degenerate primers were analyzed by TEGD. Lanes
B1-8 correspond to PCR products generated using cDNA templates from normal mammary gland cells (B1) and tumor or cell line (B2-8)
samples. B8 is a cell line (MDAMB468) with a homozygous deletion of Smad4 and serves as an internal control. The arrows point to distinct PCR
products that were abnormal compared to the normal control. The positions of various Smad genes and their variants as identified from sequence
analysis are indicated on the right panel.

[Figure 2 gel image: lanes 1-14; bands labeled Smad8α, Smad8β, Smad8γ, Smad3α and Smad3β]

Figure  2.  Semi-quantitative  RT-PCR  analysis  of  Smad8  expression  in  breast  cancer. 

Total RNA was prepared using the Trizol method from the indicated breast cancer specimens and analyzed by RT-PCR. Lane 1 is a normal breast
sample, lanes 2-4 and 12-14 are primary tumor samples, and lanes 5-11 are cell lines. Smad8α, Smad8β and Smad8γ are three of the major
differentially spliced forms of Smad8, which correspond to the full-length transcript, deletion of exon 2, and deletion of exons 2 & 3, respectively. Analysis
of the Smad3 gene is used for normalization and quantitation of Smad8.

The  Smad  family  of  genes  has  highly  homologous  amino  acid  sequences  at  their  N-  and  C- 
terminal  regions  (MH1  and  MH2  respectively),  which  are  separated  by  a  highly  divergent  linker 
region rich in proline, serine and threonine (1; Figure 1A). We have exploited TEGD
as a tool to identify the Smad8 gene as a critical target for loss of function due to down-regulation of
gene expression in breast cancer (Figures 1A & 1B). Subsequent analysis of the Smad8 gene
by semi-quantitative RT-PCR with gene-specific primers in breast and other cancers showed
loss of expression in nearly 31% (11/35) of breast cancers (Figure 2). We believe that this is a
significant finding, as even the most celebrated tumor marker for breast cancer, HER2/neu
gene amplification, occurs in about 20%-30% of breast cancer cases (3).

We decided to extend these observations and investigate potential mechanisms for the
loss of Smad8 gene expression because of the significance of this alteration in breast
cancer and its potential implications for the design of diagnostic and therapeutic strategies. Since
our analysis of chromosomal deletions was negative, we considered epigenetic silencing of gene
expression due to DNA methylation and associated chromatin modification (4). DNA sequence
analysis of bisulfite-treated genomic DNA revealed that CpG islands localized to nucleotides
3541028 to 35410583 (chromosome 13q12-14, on the reverse strand between Rb and BRCA2;
UCSC genome browser, http://genome.ucsc.edu) in the first intron of the Smad8 gene are
methylated only in cancers that exhibited loss of expression (data not shown). We confirmed these
observations by methylation-specific PCR (MSP) with primers designed to the
corresponding differentially methylated regions, and the results were consistent with the earlier
observation that the Smad8 gene is silenced in breast cancer by DNA hypermethylation
affecting CpG islands in the first intron of the Smad8 gene (Figure 3A).

Furthermore,  the  physiological  significance  of  the  role  of  DNA  hypermethylation  in 
Smad8  gene  silencing  was  established  with  the  ability  to  recover  gene  expression  upon  treatment 
with  5’-aza-2’-deoxycytidine  (5Aza-dC;  a  DNA  demethylating  agent)  in  cell  lines  that  were 
previously  determined  as  exhibiting  DNA  hypermethylation  mediated  gene  silencing  of  Smad8 
(Figure 3B). These observations strongly support the conclusion that loss of Smad8 expression in breast cancer
is primarily mediated by hypermethylation of cis-regulatory CpG islands of the gene.

Figure 3. Epigenetic gene silencing of the SMAD8 gene by altered DNA methylation patterns.

A. MSP (methylation-specific PCR) analysis of the CpG islands of intron 1 of the Smad8 gene in the indicated breast (MDAMB231, MDAMB468,
MDAMB435S) and prostate (LNCaP, DU145) cancer cell lines that are either proficient (+) or deficient (-) in Smad8 expression. Placental DNA
(PDNA) and in vitro methylated DNA (IVM) serve as negative and positive controls. Lanes U and lanes M indicate the presence of unmethylated
and methylated templates, respectively.

B. The indicated cell lines were treated with 1-5 μM 5-AZA-dC for 7 days or with 300 nM TSA for 24 hrs. To assess the effect of 5-AZA-dC
and TSA in combination, cells were exposed sequentially to 5-AZA-dC for 7 days and subsequently to 300 nM TSA for an additional 24 hrs. Total
RNA and genomic DNA were isolated, and Smad8 expression and DNA hypermethylation were determined by RT-PCR and MSP analysis (data not
shown), respectively.

Figure 4. Relationship between Smad4 status and the expression of VEGF.

A. Western blotting was used to screen for stable cell lines that constitutively express Smad4 and corresponding isogenic controls that have
integrated the empty vector. Lanes 1-5 correspond to derivatives of a Smad4-/- colon cancer cell line (CC1) stably transfected with empty
vector (1 & 2) or pCMV-Smad4 (3-5). Lanes 6-9 are a breast cancer cell line, BR05 (Smad4mt), stably transfected with an empty vector (6 & 8),
TGFβ-RII receptor (7) or pCMV-Smad4 (9). Please note that the clone in lane 5 is a false positive. B. Effect of over-expression of the Smad4
gene on VEGF. Total RNA was analyzed with the RiboQuant probes (BD-PharMingen, San Diego, CA) to detect the indicated mRNAs.
GAPDH was included as an internal control. C1, T1 & T2 are stable transfectants of CC1 Smad4-/- with empty vector (C1) or Smad4 expression
vector (T1 & T2). The evaluations were made in the presence/absence of TGFβ and under normoxic/hypoxic conditions as indicated.


In summary, we conclude that our preliminary data provide the first direct evidence that
silencing  of  gene  expression  via  DNA  hypermethylation  of  the  Smad8  gene  could  be  an 
important  event  in  breast  cancer  progression  and  metastasis. 

Furthermore,  preliminary  results  from  the  experiments  to  investigate  the  role  of  Smad4  in 
cancer  metastasis  are  encouraging  as  the  introduction  of  wild-type  Smad4  into  a  colon  cancer 
cell  line  with  homozygous  deletion  of  Smad4  exhibited  a  decrease  in  VEGF  expression  (Figure 
4). Interestingly, the presence of TGFβ and hypoxic conditions that mimic advanced tumors
elicited  a  significant  increase  in  the  expression  of  VEGF,  a  marker  for  angiogenesis/  metastasis. 
These  studies  are  currently  being  repeated  in  breast  cancer  cell  culture  models. 

We  are  planning  to  extend  these  studies  to  not  only  confirm  this  phenomenon  with  other 
candidate  genes  but  also  identify  a  wide  spectrum  of  other  critical  genes  important  for  the 
metastatic  progression  of  breast  cancer  using  the  microarray  (Affymetrix)  technology. 

Once legitimate metastasis mediator and effector gene(s) are identified, evaluation of the
status  of  the  candidate  gene(s)  for  inactivation/  activation  in  metastatic  breast  cancer  will 
commence  as  described  in  the  original  proposal  (5;  Task  3). 

4.  Key  research  accomplishments: 

Our study provides the first direct evidence that 30% of breast cancers exhibit loss of
Smad8 expression, which makes it a highly valued marker similar to HER2/neu. Our
studies  also  provide  the  first  direct  evidence  that  the  silencing  of  gene  expression  via  DNA 
hypermethylation  of  the  Smad8  gene  could  be  an  important  event  in  breast  cancer  progression 
and  metastasis.  Therefore,  Smad8  has  the  potential  to  become  a  key  target  for  the 
development  of  diagnostic,  prognostic  and  therapeutic  strategies  to  combat  breast  cancer. 

We  have  also  identified/  generated  appropriate  tumor  cell  lines  as  well  as  experimentally 
developed  derivative  test  and  control  cell  lines  as  model  systems  to  identify  and  isolate  the 
metastatic breast cancer mediator and effector genes involved in the Smad4 signaling pathway.

5.  Conclusions: 

(1)  The  loss  of  Smad8  expression  in  breast  cancers  is  primarily  mediated  by  gene  silencing 
due  to  epigenetic  DNA  methylation  of  regulatory  regions. 

(2) A combination of Smad4 inactivation, high levels of TGFβ and hypoxic conditions could
favor  angiogenesis/  metastasis. 

(3)  The  identification  of  target  gene(s)  that  disable  Smad4  or  Smad8  signaling  to  promote 
breast  cancer  could  potentially  provide  not  only  novel  and  valuable  diagnostic  and 
prognostic  tumor  markers  but  also  key  arsenals  to  combat  breast  cancer. 

ABSTRACT

To  address  the  challenge  of  identifying  related  members  of  a  large 
family  of  genes,  their  variants  and  their  patterns  of  expression,  we  have 
developed  a  novel  technique  known  as  Targeted  Expressed  Gene  Display 
(TEGD).  Here  we  demonstrate  the  general  application  of  this  technique  by 
analyzing  the  Smad  genes,  and  report  that  the  loss  of  Smad8  expression  is 
associated  with  multiple  types  of  cancers,  including  31%  of  both  breast 
and  colon  cancers.  Epigenetic  silencing  of  Smad8  expression  by  DNA 
hypermethylation  in  cancers  directly  correlates  with  loss  of  Smad8 
expression.  The  Smad8  alteration  in  a  third  of  breast  and  colon  cancers 
makes  it  a  significant  novel  tumor  marker  as  well  as  a  potential  therapeutic 
target.  The  utility  of  TEGD  as  demonstrated  by  the  analysis  of  Smad 
genes  suggests  that  it  is  an  efficient  tool  for  the  initial  discovery  of 
alterations  in  expressed  genes  within  highly  homologous  gene  families. 

INTRODUCTION

Methods  such  as  RT-PCR,  cDNA  subtraction,  differential  display  (DD), 
representational  difference  analysis  (RDA),  serial  analysis  of  gene  expression 
(SAGE)  and  microarrays  have  been  widely  used  in  the  identification  of  novel 
transcripts  as  well  as  in  the  assessment  of  their  levels  of  expression  in 
development,  various  cellular  processes  and  diseases  including  cancer. 
Despite  the  usefulness  of  these  techniques  in  the  overall  assessment  of 
genes that are highly divergent at the DNA sequence level, accurate and
high-throughput evaluation and discovery of related members of a gene family
have  remained  a  challenge.  These  methods  in  general  have  been  unable  to 
discriminate  between  different  members  of  the  gene  families  with  consistency 
because  of  the  inherent  redundancy  in  DNA  sequence  among  these  unique 
genes  and  transcripts.  A  novel  method  described  here,  targeted  expressed 
gene  display  (TEGD),  validated  using  the  Smad  family  of  genes  as  the 
prototype,  enables  one  to  overcome  this  dilemma  when  gene  family  members 
contain  at  least  two  regions  of  homology  separated  by  a  divergent  region  of 
variable  length. 

The discovery of the Smad family of signal transducer proteins as
mediators of TGFβ (transforming growth factor-beta) signaling from the cell
membrane to the nucleus has revolutionized the understanding of the
molecular basis of the signaling and inactivation of TGFβ/BMP pathways in
cancer (1). To date, eight human homologues of the Smad genes have been
identified and are classified into three distinct classes based on their
structures and biological functions (1,2). The first category consists of
pathway-restricted or receptor-regulated Smads (R-Smads): Smad1, Smad5
and Smad8, which are involved in BMP signaling, and Smad2 and Smad3,
which are TGFβ/activin pathway restricted. These Smads are activated
directly via phosphorylation by RI receptors following the formation of a
complex consisting of the ligand-bound heteromeric RI/RII receptors.
Phosphorylated R-Smads interact with the second class of Smads, known as
the common mediator Smad (Co-Smad), to form a heteromeric complex (3).
Smad4 is the only member of this class of Smads known in mammals. The
third class of Smads includes Smad6 and Smad7, which were identified as
anti-Smads or inhibitory Smads (I-Smads) due to their ability to act as inhibitors
of the signaling pathway (4-6).

Since the signaling pathways mediated by the members of the TGFβ
family  are  implicated  in  a  number  of  biological  processes  including  cell 
differentiation,  cell  proliferation,  determination  of  cell  fate  during 
embryogenesis,  cell  adhesion,  cell  death,  angiogenesis,  metastasis  and 
immunosuppression,  it  is  conceivable  that  genetic  or  epigenetic  anomalies 
leading  to  altered  expression  patterns  of  various  Smad  molecules  could 
contribute  to  different  aspects  of  neoplastic  progression  (2, 7-10).  Although 
there  has  been  significant  progress  in  elucidating  the  association  between 
genetic  alterations  in  the  Smad4  gene  and  cancer,  the  nature  of  defects 
involving  the  other  Smads  has  been  elusive  (11-16).  The  apparent  lack  of 
genetic  alterations  in  the  majority  of  Smad  genes  analyzed  thus  far  in  cancer 
provides  compelling  support  for  the  potential  role  of  epigenetic  alterations, 
whereby  abnormalities  in  signaling  could  occur  at  the  level  of  regulation  of 
gene  expression  or  processing  of  the  transcripts  (17-19).  Our  analysis  of  the 
Smad  genes  provides  evidence  for  the  exploitation  of  the  novel  TEGD  method 
described in this article in the initial determination of the mode of inactivation
of  the  Smad  genes  in  cancer.  Thus,  we  predict  that  the  effective  utilization  of 
the  method  described  here  will  find  wide  use  not  only  in  the  discovery  of  novel 
members  of  a  family  of  genes  and  splice  variants  of  a  specific  gene,  but  also 
for  the  simultaneous  analysis  of  the  transcript  levels  of  individual  genes  or 
their  spliced  variants  in  various  diseases  and  during  development. 

MATERIALS  AND  METHODS 

Cell  culture,  RNA  isolation  and  cDNA  synthesis. 

Cancer  cell  lines  were  purchased  from  ATCC  or  Coriell  Cell  Repository 
and  culture  conditions  were  followed  as  suggested  by  the  provider.  Tumor 
samples,  some  of  the  cell  lines  and  their  derivatives  or  nucleic  acids  isolated 
from  the  samples  used  in  this  study  were  obtained  from  Subra  Kugathasan 
(Medical  College  of  Wisconsin),  Peter  Thomas  (Boston  University  School  of 
Medicine), Douglas Faller (Boston University School of Medicine), Ramon
Parsons  (Columbia  University)  and  Kornelia  Polyak  (Dana  Farber  Cancer 
Institute).  RNA  isolation  and  cDNA  synthesis  from  the  cell  lines  and  tumor 
samples  were  carried  out  using  previously  described  procedures  (20). 

Smad  genes  degenerate  RT-PCR. 

Based  on  the  amino  acid  sequences  of  the  human  Smads  1-8,  regions 
that  are  identical  and  conserved  (MH1  and  MH2)  among  the  Smads  were 
mapped out (1, 2). The residues targeted for the primer design were localized
to the MH1 and MH2 domains, and the intervening linker regions were highly
divergent, enabling the generation of PCR products of unique size
corresponding to specific Smad homolog(s). The forward and reverse primers
were  designed  based  on  the  maintenance  of  codon  degeneracy  and  the 
representation  of  the  various  amino  acids  at  a  given  position  among  the 
known  Smad  family  members  as  determined  from  the  sequence  alignment  of 
the  various  homologs.  All  primers  were  obtained  from  Integrated  DNA 
Technologies,  Coralville,  IA. 
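The reasoning behind this primer placement can be illustrated with a minimal sketch. The transcript names and primer-site coordinates below are hypothetical placeholders, not measured Smad values; the sketch only shows that one primer pair anchored in MH1 and MH2 yields a distinct product length for each family member or splice variant when the linker length differs.

```python
from typing import Dict, List, Tuple

# Hypothetical coordinates (nucleotides) of the conserved primer sites in each cDNA:
# name -> (start of forward primer site in MH1, end of reverse primer site in MH2).
# These numbers are illustrative only and are not taken from the report.
PRIMER_SITES: Dict[str, Tuple[int, int]] = {
    "memberA": (300, 1250),
    "memberB": (310, 1130),
    "memberB_splice_variant": (310, 1019),
}


def expected_product_sizes(sites: Dict[str, Tuple[int, int]]) -> List[Tuple[str, int]]:
    """Return (name, product length) pairs, longest first,
    i.e. in the top-to-bottom order expected on a sizing gel."""
    sizes = [(name, end - start) for name, (start, end) in sites.items()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)


def all_sizes_unique(sites: Dict[str, Tuple[int, int]]) -> bool:
    """True when every transcript gives a product of a different length."""
    lengths = [length for _, length in expected_product_sizes(sites)]
    return len(lengths) == len(set(lengths))
```

With real alignments, the same calculation would indicate in advance which family members could co-migrate and therefore need sequencing to tell apart.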

The Smad family-specific degenerate primers used for TEGD are as
follows: SmadXF2 (5’ primer) - TNTKBMGVTGGCCNGAYYTBM; SmadXR1 (3’
primer) - CCAVCCYTTSRCRAARCTBAT (codes for mixing of bases to
generate degeneracy: R=A,G; Y=C,T; M=A,C; K=G,T; S=C,G; W=A,T;
H=A,C,T; B=C,G,T; V=A,C,G; D=A,G,T; N=A,C,G,T).
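As a small illustration of what these degeneracy codes imply, the sketch below counts and enumerates the individual oligonucleotides represented by each degenerate primer. The code table is the standard IUPAC nucleotide code; the function names are ours and are not part of the original method.

```python
from itertools import product
from typing import Iterator

# Standard IUPAC nucleotide codes (B = C, G or T is used in both primers).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}

SMADXF2 = "TNTKBMGVTGGCCNGAYYTBM"  # 5' primer from the text
SMADXR1 = "CCAVCCYTTSRCRAARCTBAT"  # 3' primer from the text


def degeneracy(primer: str) -> int:
    """Number of distinct sequences present in a degenerate primer pool."""
    count = 1
    for code in primer:
        count *= len(IUPAC[code])
    return count


def expand(primer: str) -> Iterator[str]:
    """Yield every concrete sequence encoded by the degenerate primer."""
    for bases in product(*(IUPAC[code] for code in primer)):
        yield "".join(bases)
```

By this count, SmadXF2 represents 13,824 distinct sequences and SmadXR1 represents 288.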


A 20 μl PCR reaction mixture contained 67 mM Tris-HCl, pH 8.8, 16.6 mM
ammonium sulfate, 6.7 mM magnesium chloride, 1 mM β-mercaptoethanol,
6% dimethyl sulphoxide, 100 μM each of dATP, dGTP, dCTP and dTTP,
radioactive dCTP (0.25 μl of α-32P-dCTP (10 μCi/μl), Amersham) for labeling,
20 μM each of the primers, 50 ng of cDNA template and 2.5 Units of Platinum
Taq (Invitrogen). An initial denaturation at 94°C for 2 minutes was followed by
30 cycles, each carried out at 94°C for 30 seconds, 57°C for 1 minute, and
70°C for 1 minute and 20 seconds, and one final extension cycle at 70°C for
10 minutes to facilitate TA cloning into pCR2.1 (Invitrogen).
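The cycling program above can be written down as a small data structure, which makes the total programmed hold time easy to check. This is only a sketch: the class and field names are ours, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # programmed hold time


@dataclass
class Program:
    initial: Step
    cycle: List[Step]
    n_cycles: int
    final: Step

    def hold_seconds(self) -> int:
        """Sum of programmed hold times, excluding temperature ramps."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.n_cycles * per_cycle + self.final.seconds


# Degenerate RT-PCR program as described in the text.
TEGD_PCR = Program(
    initial=Step(94, 120),
    cycle=[Step(94, 30), Step(57, 60), Step(70, 80)],
    n_cycles=30,
    final=Step(70, 600),
)
```

For this program the hold times add up to 5,820 seconds, or 97 minutes.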

TEGD  gel  electrophoresis  and  recovery  of  DNA  bands. 

The  samples  from  the  degenerate  RT-PCR  of  the  Smad  genes  were 
loaded  onto  a  4.5%  denaturing  polyacrylamide  gel  after  a  2  minute 
denaturation step at 95°C. Electrophoresis was performed in a Genomyx LR
analyzer  (Beckman  Coulter)  for  4.5  hrs  at  80  Watts  with  constant  power 
(voltage  not  to  exceed  2500  volts).  The  gel  was  dried  and  autoradiography 
performed  on  the  gel.  DNA  bands  of  interest  on  the  gel  were  oriented  using 
the  autoradiogram,  cut  out  of  the  gel  and  isolated  by  soaking  the  gel  slice  in 
1X TE buffer, freezing at -80°C for 30 minutes, heating at 60°C for 5 minutes
and  spinning  at  high  speed  to  separate  gel  fragments  from  the  aqueous 
phase  containing  DNA.  The  DNA  fragments  were  ethanol  precipitated  and 
isolated  using  conventional  methods  and  TA  cloned  (Invitrogen)  for 
sequencing. 

DNA  sequencing. 

DNA  sequence  analysis  was  performed  using  the  Genomyx  LR  analyzer 
(Beckman  Coulter).  The  cycle  sequencing  procedure  used  in  these  studies 
utilized 33P ddNTPs (Amersham) along with the Thermo Sequenase kit (USB,
Cincinnati,  OH)  as  previously  described  (20). 

Genomic  DNA  Isolation. 

Genomic DNA from cell lines and tumors was isolated using the DNeasy
Tissue  Kit  (QIAGEN)  according  to  the  manufacturer’s  instructions. 

Homozygous  deletion  analysis  of  Smad8. 

Radiolabeled microsatellite markers, D13S927 and D13S928, that are
localized at the beginning and end, respectively, of the Smad8 gene in its
genomic contig, and gene-specific primers encompassing the first (Smad8 EX-1F:
5’-GAAACATGTGAGGAACAGCAGC-3’ and Smad8 EX-1R:
5’-CGAGACAGCGGCTGCAGCAGCG-3’) and the second exons (Smad8 EX-2F:
5’-GCCTGGTTCTGTTGCTCAGGCTG-3’ and Smad8 EX-2R:
5’-GTGTTCCTGTGGCATTCAGGC-3’) of the Smad8 gene, were used in PCR
amplifications and gel electrophoretic analysis to determine deletion of this
genomic region (12).


Analysis  of  gene  expression  using  semi-quantitative  RT-PCR. 

Total  RNA  prepared  from  samples  was  used  for  cDNA  synthesis  and  PCR 
amplification  was  done  essentially  as  previously  described  (20).  The  gene 
specific  primer  pairs  used  in  the  analysis  of  the  indicated  specific  Smad  genes 
and the β-actin gene used for standardization to normalize the abundance of
the  various  transcripts  analyzed  are  as  follows: 

Smad1-F: 5’-CCACTGGAATGCTGTGAGTTTCC-3’
Smad1-R: 5’-GTAAGCTCATAGACTGTCTCAAATCC-3’
Smad2-F: 5’-GGTAAGAACATGTCCATCTTGCC-3’ (20)
Smad2-R: 5’-CATGGGACTTGATTGGTGAAGC-3’ (20)
Smad3-F: 5’-CGGGCCATGGAGCTGTGTGAGTTCG-3’
Smad3-R: 5’-CGGGTCAACTGGTAGACAGCCTC-3’
Smad4-F: 5’-GGACAATATGTCTATTACGAATAC-3’ (20)
Smad4-R: 5’-TTTATAAACAGGATTGTATTTTGTAGTCC-3’ (20)
Smad5-F: 5’-GTATCAACCCATACCACTATAAGAG-3’
Smad5-R: 5’-CAGAGGGGAGCCCATCTGAGTAAG-3’
Smad7-F: 5’-GGTGCGAGGTGCCAAATGTCACC-3’
Smad7-R: 5’-GATGAACTGGCGGGTGTAGCAC-3’
Smad8-F: 5’-CTCTTATGCACTCCACCACCCCCATC-3’
Smad8-R: 5’-CTTAAGACATGACTGTTAAGACACTG-3’
β-Actin-F: 5’-ACACTGTGCCCATCTACGAGG-3’
β-Actin-R: 5’-AGGGGCCGGACTCGTCATACT-3’

The relative abundance of the various Smad gene-specific PCR products
was normalized to β-actin or other unaffected Smads by comparing the
abundance of the products using densitometry.
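The normalization described above amounts to a ratio of ratios. The sketch below is a minimal illustration under stated assumptions: the band intensities are made-up densitometry readings, and the 0.1 cut-off for calling loss of expression is a hypothetical threshold chosen for the example, not a value given in this report.

```python
def normalized_expression(target_band: float, reference_band: float,
                          normal_target: float, normal_reference: float) -> float:
    """Target/reference band ratio of a sample, expressed relative to the
    same ratio measured in the normal control (1.0 means equal to normal)."""
    if min(reference_band, normal_target, normal_reference) <= 0:
        raise ValueError("reference and normal-control intensities must be positive")
    sample_ratio = target_band / reference_band
    normal_ratio = normal_target / normal_reference
    return sample_ratio / normal_ratio


def is_loss_of_expression(relative_level: float, threshold: float = 0.1) -> bool:
    """Hypothetical call: expression below `threshold` of the normal level."""
    return relative_level < threshold


# Made-up densitometry readings (arbitrary units) for one tumor and the normal control.
tumor_level = normalized_expression(target_band=50.0, reference_band=2000.0,
                                    normal_target=1000.0, normal_reference=2000.0)
```

Dividing by the reference band corrects for differences in RNA input and cDNA yield between lanes before samples are compared with the normal control.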

Processing  of  genomic  DNA  for  the  evaluation  of  methylation  status. 

For  bisulfite  sequencing  and  the  MSP  assay,  genomic  DNA  was  isolated 
from  cell  lines  and  primary  tumors  using  the  QIAGEN  DNeasy  Tissue  Kit. 
Genomic  DNA  was  subjected  to  a  deamination  reaction  by  incubation  with 
sodium bisulfite essentially as previously described (21). In brief, 0.5 to 2 μg
genomic  DNA  was  denatured  with  2  M  NaOH  for  10  min,  followed  by  bisulfite 
modification  by  treatment  with  freshly  prepared  10  mM  hydroquinone  and  3  M 
sodium  bisulfite,  pH  5.0  (Sigma),  which  converts  unmethylated  cytosines  to 
uracil  but  does  not  change  methylated  cytosines.  Each  reaction  was  overlaid 
with  mineral  oil  and  incubated  at  50°C  for  16-20  hours.  After  treatment,  the 
modified  DNA  was  purified  using  a  Wizard  DNA  purification  kit  (Promega, 
Madison,  Wisconsin),  followed  by  desulfonation  by  treating  with  3  M  NaOH. 
The ethanol-precipitated, purified DNA pellet was dissolved in 30 μl of distilled
water. 
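The effect of the bisulfite reaction on the sequence that is later read by PCR or sequencing can be sketched in a few lines. This is an in silico illustration only: the toy sequence is hypothetical, only the treated strand is modeled, and methylation is assumed to occur at CpG cytosines.

```python
from typing import FrozenSet, Iterable, Set


def cpg_positions(seq: str) -> Set[int]:
    """0-based positions of cytosines that are followed by a guanine (CpG sites)."""
    seq = seq.upper()
    return {i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"}


def bisulfite_convert(seq: str, methylated: Iterable[int] = frozenset()) -> str:
    """Sequence read after bisulfite treatment and PCR:
    unmethylated C is deaminated to U and read as T; methylated C stays C."""
    protected: FrozenSet[int] = frozenset(methylated)
    return "".join(
        "T" if base == "C" and i not in protected else base
        for i, base in enumerate(seq.upper())
    )


toy = "ACGTCCGA"  # hypothetical sequence with CpG cytosines at positions 1 and 5
unmethylated_read = bisulfite_convert(toy)
methylated_read = bisulfite_convert(toy, cpg_positions(toy))
```

The two reads differ only at the CpG cytosines, which is the sequence difference that bisulfite sequencing and MSP primers detect.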

Bisulfite  sequencing. 

The intron 1 region of the Smad8 gene containing CpG islands was first
PCR amplified from bisulfite-modified DNA (50-100 ng) using gene-specific
primers (5’-GAAATATGTGAGGAATAGTAGTTTAG-3’ and
5’-CCACTCATCCCTCCCCCACCCAAATC-3’) and the product was gel purified.
Genomic sequencing of the Smad8 gene-specific PCR product was
accomplished by using the DNA sequencing primer,
5’-GTAAGTAGGGTTTTTTGGT-3’, along with 33P ddNTPs and the
Thermo Sequenase kit (USB, Cincinnati, OH) as previously described (20).

Methylation-Specific PCR (MSP).

The  methylation  status  of  the  Smad8  promoter  region  was  also  analyzed 
by  MSP  with  the  use  of  primers  designed  for  the  amplification  of  defined  CpG 
islands  containing  DNA  sequences  of  either  unmethylated  or  methylated  DNA 
(21). Sequences of the forward (F) and reverse (R) MSP primers used in
this study to distinguish methylated (M) and unmethylated (U) genomic DNA
were as follows: 5’-GATGTGAGGTGATTTATGTAGT-3’ (Smad8U-F)
and 5’-CACAACAACCTACAACTCAATTCCCT-3’ (Smad8U-R), and
5’-GACGCGAGGCGATTTACG-3’ (Smad8M-F) and
5’-CGACCACGTACGCGAAAACTCGCG-3’ (Smad8M-R). PCR conditions were
as follows: 94°C for 2 min, 35 cycles of 94°C for 30 sec, 58°C for 30 sec, 70°C
for 40 sec, followed by a final extension at 70°C for 10 min. A 10 μl sample
of  each  PCR  product  was  mixed  with  1  X  loading  buffer  and  analyzed  by 
electrophoresis  on  a  nondenaturing  8%  polyacrylamide  gel  and  visualized  by 
staining  with  ethidium  bromide. 
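The logic behind the paired forward primers can be checked with a short sketch. It assumes that the methylated-specific forward primer contains cytosines only within CpG dinucleotides, so that replacing every C with T gives the sequence expected from an unmethylated, bisulfite-converted template. The Smad8U-F sequence is taken here as GATGTGAGGTGATTTATGTAGT, which is an assumed reading of the printed primer. The helper names are hypothetical.

```python
def count_cpg(seq):
    """Count CpG dinucleotides (a C immediately followed by a G)."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def unmethylated_version(primer):
    """Forward-strand primer as it would read if no cytosine were protected."""
    return primer.replace("C", "T")


SMAD8M_F = "GACGCGAGGCGATTTACG"      # methylated-specific forward primer from the text
SMAD8U_F = "GATGTGAGGTGATTTATGTAGT"  # assumed reading of the unmethylated forward primer

converted = unmethylated_version(SMAD8M_F)
print(count_cpg(SMAD8M_F))             # CpG sites interrogated by the M primer
print(converted)                       # C -> T version of the M primer
print(SMAD8U_F.startswith(converted))  # do both primers target the same site?
```

Under these assumptions the converted M primer matches the 5’ end of the U primer, which suggests that the two forward primers interrogate the same CpG sites and differ only in the methylation state they recognize.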

5’-Aza-2’ deoxycytidine and TSA treatment.

HTB129,  MDAMB468,  MDAMB231,  CaCo2,  H441,  CCL230  and  HT29 
cells  were  incubated  in  culture  medium  with  and  without  5’-Aza-2’ 
deoxycytidine (Sigma) at a concentration of 1-5 µM for 7 days or with 300 nM
trichostatin A (TSA) for 24 hrs. To assess the effect of a combination of
5’-Aza-2’ deoxycytidine and TSA, cells were exposed sequentially for 7 days to
5’-Aza-2’ deoxycytidine and then to TSA for an additional 24 hrs. Total RNA was
isolated and Smad8 expression was determined by RT-PCR using the
primers Smad8-1F: 5’-CAGCTCAGCCTCCTGGCCAAG-3’ and Smad8-1R:
5’-GAGGAAGCCTGGAATGTCTC-3’.

RESULTS 

TEGD  and  signature  banding  pattern  of  the  Smads. 

Members  of  the  Smad  family  of  genes  have  highly  homologous  amino  acid 
sequences at their N- and C-terminal regions (MH1 and MH2 domains,
respectively), which are separated by a highly divergent linker region rich in
proline, serine and threonine (1, 2). These regions may have arisen from
divergence due to functional specificities from an ancestral unit of activity
that  has  maintained  some  degree  of  evolutionary  conservation  at  the  level  of 
the  protein.  The  examination  of  the  MH  domains  from  various  Smad  genes 
indicated  that  there  is  identity  and  conservation  among  amino  acid  residues  at 
defined  positions,  which  is  consistent  with  critical  structural  features  required 
for  the  function  of  these  proteins.  The  sequence  conservation  at  the  amino 
acid level is also reflected at the DNA sequence level. Despite their similarity
at the level of the genetic code, the Smad proteins are involved in a wide
array of cellular functions as they not only play roles as mediators, inhibitors
and transcription factors of the Smad signaling pathways but also mediate
signaling in response to diverse but related cytokines (TGFβ family). Even
though  the  delineation  of  the  alterations  in  Smads  is  essential  for  the 
comprehension  of  the  molecular  basis  of  various  defective  processes,  the 
analysis  of  defects  in  individual  members  in  this  type  of  family  of  genes  poses 
a  formidable  task  for  efficient  detection  in  a  high  throughput  platform.  Success 
in  identifying  alterations  in  Smad  genes  could  be  expected  to  provide  critical 
information  necessary  for  deciphering  the  molecular  basis  of  their  functions. 
The  fact  that  the  Smad  genes  contain  two  distinct  highly  conserved  regions 
separated  by  a  highly  variable  intervening  linker  region  allowed  us  to  develop 
a  novel  screening  strategy  to  simultaneously  analyze  all  the  known  members 
of this family (Figure 1A).

We  have  designed  degenerate  oligonucleotide  primers  corresponding  to 
the conserved regions of the Smad family of genes based on the preservation
of  codon  degeneracy  and  conserved  amino  acids  at  a  given  position  among 
the  known  Smads  for  PCR  amplification  of  the  cDNA  templates.  PCR 
amplification  in  the  presence  of  radiolabeled  nucleotides,  and  the  subsequent 
analysis of the products using denaturing polyacrylamide gel
electrophoresis revealed distinct bands on the gel (Figure 1A & B). We
recovered  the  distinct  bands  corresponding  to  the  PCR  products  generated 
using  Smad-specific  degenerate  primers  and  sequenced  them  using  primers 
that  are  specific  for  the  predicted  Smad  gene(s).  The  bands  corresponding  to 
the  1200,  960,  840,  680  and  570  base  pairs  (bp)  PCR  products  were  found  to 
be identical to the cDNA sequences for Smad4, Smad1 and Smad5, Smad2,
Smad3 and Smad8, and Smad6 and Smad7, respectively, as predicted from
their estimated sizes and sequences (1, 2; Figure 1B). These results
suggested  to  us  that  once  the  signature  banding  pattern  (SBP)  of  the  targeted 
expression  gene  display  is  optimized  and  established,  such  as  in  this  case 
with  the  Smad  family  of  genes,  repeat  analysis  of  gene  expression  in  tissues 
or  other  samples  of  unknown  origin  could  be  easily  adopted  for  a  routine  high 
throughput  analysis.  Although  we  generated  and  analyzed  radiolabeled  PCR 
products  in  these  initial  studies,  one  could  also  achieve  the  same  results 
using  fluorescently  or  radioactively  end-labeled  primers  for  PCR  amplification. 
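Once a signature banding pattern is established, screening an unknown sample reduces to asking which expected bands are absent. The sketch below illustrates that comparison using the band sizes and gene assignments reported above. The sizing tolerance, the function name and the example lane are hypothetical and serve only as illustration.

```python
# Band sizes (bp) and the Smad cDNAs reported for each band in the text.
SIGNATURE = {
    1200: ("Smad4",),
    960: ("Smad1", "Smad5"),
    840: ("Smad2",),
    680: ("Smad3", "Smad8"),
    570: ("Smad6", "Smad7"),
}


def missing_bands(observed_sizes, tolerance_bp=15):
    """Return {expected size: genes} for signature bands with no observed
    band within `tolerance_bp` (a hypothetical sizing tolerance)."""
    missing = {}
    for size, genes in SIGNATURE.items():
        if not any(abs(size - obs) <= tolerance_bp for obs in observed_sizes):
            missing[size] = genes
    return missing


sample = [1198, 962, 841, 572]  # invented lane lacking the ~680 bp band
print(missing_bands(sample))
```

Because several Smads share a band size at this resolution, a missing band only nominates candidate genes. Gene-specific RT-PCR, as used in this study, is still needed to identify which transcript is lost.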

Validation  of  Smad  expression  patterns  determined  from  TEGD. 

We  confirmed  the  presence  or  absence  of  Smad  expression  determined 
from  TEGD  using  gene  specific  primers  by  semi-quantitative  RT-PCR  (Figure 
1C).  The  expression  patterns  of  the  various  Smads  detected  by  TEGD 
remained consistent with semi-quantitative RT-PCR results. Most of the
Smads were expressed in all the tissue types that we have analyzed;
however, some Smad expression was lost in the liver and was decreased to
barely  detectable  levels  in  the  bone  marrow  and  uterus  (Figure  1C).  These 
results  indicated  to  us  that  TEGD  could  be  used  as  a  tool  for  initial  diagnostic 
high  throughput  evaluations  to  determine  Smad  gene  expression  patterns 
simultaneously  and  with  a  high  degree  of  efficiency.  Thus,  TEGD  can  be 
regarded  as  a  highly  improved  alternate  method  that  may  substitute  for  the 
traditional  multiplex  PCR  technique  due  to  its  increased  level  of  sensitivity, 
ability  to  discriminate  between  genes  that  are  closely  related  at  their  DNA 
sequence  and  the  low  level  of  cDNA  template  required  for  the  analysis. 

Differentially spliced variants of the Smads.

TEGD  also  enabled  us  to  identify  the  various  differentially  spliced  forms  of 
the Smad2, Smad3, Smad5, and Smad8 genes (Figures 1B & C; data not
shown). Alternatively spliced variants of Smad2 with a deletion of exon 3
(Smad2Δexon3), Smad3 with deletions of both exons 3 and 7
(Smad3Δexon3Δexon7), Smad5 with a deletion of exon 3 (Smad5Δexon3)
and Smad8 with deletions of either exon 3 (Smad8Δexon3) or both exons 2
and 3 (Smad8Δexon2Δexon3) were detected in our analysis. Although one of
these variants (Smad2Δexon3) has been previously reported, the existence of
the  others  has  been  recorded  for  the  first  time  in  this  study  (22).  However,  our 
study  did  not  verify  the  existence  of  two  previously  reported  alternatively 
spliced  forms  with  deletions  at  the  3’  ends,  potentially  due  to  the  placement  of 
the  TEGD  primers  inside  the  affected  sequence  of  these  alternatively  spliced 
forms  (23,  24).  The  encoded  proteins  of  Smad2,  Smad5  and  Smad8  resulting 
from  full-length  and  variant  transcripts  that  have  been  described  also  exhibit 
differences  in  their  biochemical  properties  (22-24).  Despite  these  findings,  the 
overall  significance  of  the  described  and  predicted  novel  spliced  forms  of 
Smads  reported  both  here  and  elsewhere  in  disease  phenotypes  including 
cancer  requires  further  studies  (22-24;  UCSC  genome  browser: 
http://genome.ucsc.edu).

TEGD  in  the  analysis  of  Smads  in  cancer. 

The  ability  to  simultaneously  probe  multiple  members  of  a  gene  family 
using TEGD prompted us to apply this technique to analyze differential
expression patterns of the various Smads in cancer to validate its utility for
diagnostic screening (Figure 2A). We were able to utilize the signature
banding patterns established with the normal tissues to determine the
retention or loss of specific DNA bands corresponding to the defined
full-length and variant transcripts (Figures 1B and 2A). The TEGD analysis of the
Smad genes in cancers led us to conclude that there is a significant level of
loss  in  the  expression  of  Smad3  and  Smad8  in  colon  cancer  and  of  Smad8  in 
breast  cancer.  These  initial  observations  were  further  validated  by  analyzing 
the  expression  patterns  of  the  Smad8  gene  more  carefully  using  gene  specific 
primers and semi-quantitative RT-PCR (Figure 2B). These results further
confirmed  the  TEGD  data  and  provided  the  first  clues  to  suggest  that  the 
Smad8  gene  is  a  critical  target  for  loss  of  function  due  to  down  regulation  of 
gene  expression  in  31%  of  breast  and  colon  cancers  (Table  1).  The  analysis 
to  establish  the  significance  of  the  loss  of  Smad3  expression  in  colon  cancer 
will  be  dealt  with  in  greater  detail  elsewhere  (Cheng  and  Thiagalingam, 
unpublished results). In conclusion, TEGD can be used as an initial diagnostic
tool  in  cancer  and  other  diseases  to  simultaneously  analyze  differential 
expression  patterns  of  genes  that  are  closely  related  at  the  level  of  their 
nucleotide  sequence. 

Molecular  mechanism  for  the  silencing  of  Smad8  expression. 

From  our  analysis,  loss  of  expression  of  the  Smad8  gene  was  estimated  to 
occur  in  nearly  a  third  of  both  breast  and  colon  cancers,  which  are  two  of  the 
leading  causes  of  cancer  deaths  in  women  and  in  general,  respectively 
(Figure 2B; Table 1). Hence, we investigated potential mechanisms for the
loss of Smad8 gene expression in cancer due to the high level of significance
of  this  alteration  with  respect  to  the  known  tumor  markers.  We  examined 
whether  genetic  alterations  such  as  chromosomal  deletions  affecting  the 
Smad8  gene  could  lead  to  the  loss  of  its  expression  by  homozygous  deletion 
analyses. We used microsatellite markers corresponding to the Smad8 gene,
chosen from the genomic contig, as well as genomic PCR with primers that
amplified the genomic region corresponding to the first two exons of the
Smad8 gene. These experiments indicated that gross genomic deletions are
apparently  not  the  major  mechanism  of  Smad8  inactivation  in  the  affected 
cancers  (data  not  shown).  Therefore,  we  considered  epigenetic  silencing  as 
an  alternate  mechanism  for  Smad8  gene  silencing. 

The  genomic  sequence  of  the  Smad8  gene  was  inspected  for  the 
presence  of  CpG  islands  that  may  be  the  targets  of  DNA  hypermethylation 
and  associated  chromatin  modification  effects  for  their  involvement  in  the 
silencing  of  Smad8  gene  expression.  Several  CpG  islands  in  the  upstream 
promoter  as  well  as  in  the  first  intronic  region  of  the  Smad8  gene  were  tested 
as  likely  candidate  regions  that  could  be  critical  for  differential  DNA 
methylation  patterns  coinciding  with  the  loss  of  Smad8  expression  (data  not 
shown,  Figure  3).  DNA  sequence  analysis  of  the  bisulfite  treated  genomic 
DNA  revealed  that  CpG  islands  localized  to  nucleotides  3541028  to  35410583 
(Chromosome 13q12-14 on the reverse strand between Rb and BRCA2;
UCSC genome browser: http://genome.ucsc.edu) in the first intron of the
Smad8 gene are methylated only in cancers that exhibit loss of expression
(Figure 3A & B). Methylation-specific PCR (MSP) was carried out using
primers designed to these differentially methylated regions, and
the results further confirmed that the Smad8 gene is silenced in cancers due
to DNA hypermethylation affecting CpG islands in the first intron of the Smad8
gene (Figure 3C).
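Reading a bisulfite sequencing result amounts to checking, at each CpG of the untreated reference, whether the cytosine survived conversion. The sketch below shows that call under simplifying assumptions: the read is already aligned to the reference, it has the same length, and it covers the same strand. The sequences and the function name are hypothetical and do not correspond to Smad8 intron 1.

```python
def call_cpg_methylation(reference, bisulfite_read):
    """Classify each CpG cytosine in `reference` as methylated or not.

    Both strings must be aligned, equal in length and on the same strand.
    A C retained in the read means methylated; a T means unmethylated.
    """
    if len(reference) != len(bisulfite_read):
        raise ValueError("reference and read must be aligned and equal length")
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            base = bisulfite_read[i]
            if base == "C":
                calls[i] = "methylated"
            elif base == "T":
                calls[i] = "unmethylated"
            else:
                calls[i] = "ambiguous"
    return calls


ref = "TACGGCATCGA"         # hypothetical untreated reference
silenced = "TACGGTATCGA"    # invented read: CpG cytosines retained
expressing = "TATGGTATTGA"  # invented read: every cytosine converted

print(call_cpg_methylation(ref, silenced))
print(call_cpg_methylation(ref, expressing))
```

In this toy case the non-CpG cytosine is converted in both reads, while the CpG cytosines are retained only in the read standing in for a sample that lacks expression.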

DNA  hypermethylation  and  Smad8  expression  in  cancer. 

To directly assess the physiological significance of the observed epigenetic
DNA methylation, we examined its role, by itself or in combination with histone
acetylation/deacetylation, in the differential regulation of Smad8 expression in
cancers. We chose six cell lines derived from breast, colon
and  lung  cancers  (HTB129,  HT29,  CaCo2,  CCL253,  MDAMB468  and  H441) 
that  exhibited  loss  of  Smad8  expression,  and  one  cell  line  (MDAMB231) 
which  retained  Smad8  expression  as  a  control,  and  examined  the  effects  of 
5’-aza-2’-deoxycytidine  (5Aza-dC;  a  DNA  demethylating  agent)  and/or 
trichostatin  A  (an  inhibitor  of  histone  deacetylases)  on  Smad8  expression.  A 
substantial  increase  in  Smad8  expression  was  observed  with  5Aza-dC 
treatment  in  all  of  the  cell  lines,  which  were  previously  determined  to  exhibit 
DNA  hypermethylation-mediated  gene  silencing  of  Smad8  (Figure  4A). 
Trichostatin  A  by  itself  caused  only  a  slight  increase  in  the  levels  of  the 
transcript  in  two  of  the  tested  cell  lines  (CaCo2  and  CCL253)  but  had  no  effect 
in  the  majority  of  the  tested  cell  lines.  However,  there  was  a  slight  up 
regulation  of  Smad8  expression  in  the  presence  of  both  drugs  (Figure  4A). 
MSP analysis of the target CpG islands in intron 1 of the Smad8 regulatory
regions that were differentially methylated in affected and control cell lines
revealed that demethylation due to 5Aza-dC treatment is accompanied by a
corresponding increase in Smad8 gene expression (Figure 4B). These
observations strongly support the notion that the loss of Smad8 expression in
cancers is primarily mediated by hypermethylation of cis-regulatory CpG
islands of the gene.

DISCUSSION 

The  analysis  of  highly  homologous  members  of  a  family  of  genes  to  detect 
and  establish  differential  gene  expression  patterns  as  well  as  the  genetic 
alterations  responsible  for  cancer  and  other  diseases  with  limited  amounts  of 
clinical  sample  has  remained  a  formidable  task.  Efficient  methods  to 
simultaneously  analyze  the  closely  related  yet  functionally  divergent  genes 
belonging  to  families  would  not  only  be  important  in  accurate  diagnosis  and 
prognostic  evaluation  of  a  disease  but  could  also  be  exploited  for  the 
identification  of  pharmacogenetic  targets  to  customize  therapy.  We  propose 
that  the  TEGD  technique  described  in  this  article  can  be  effectively  utilized  to 
analyze  families  of  genes  that  contain  at  least  two  stretches  of  conserved 
regions,  which  are  separated  by  a  divergent  linker  region  of  variable  length. 
TEGD  provides  a  distinct  advantage  over  techniques  such  as  differential 
display  (DD),  a  comparable  methodology,  which  has  been  adopted  for  the 
simultaneous  analysis  of  multiple  genes,  due  to  the  latter’s  inability  to  detect 
differential  gene  expression  patterns  of  targeted  and  defined  genes. 
Furthermore,  even  an  improved  version  of  DD  designed  to  analyze  related 
genes  (e.g.,  kinases)  still  fell  short  of  efficiently  establishing  distinct 
expression  patterns  of  the  related  genes  and  failed  to  identify  novel  genes 
with  different  functional  roles  (25-28). 

On the other hand, with TEGD, once a signature banding pattern of the
targeted expressed gene display is optimized and established with an array of
different  normal  tissues,  such  as  in  this  case  with  the  Smad  family  of  genes, 
repeat  analysis  of  gene  expression  of  samples  of  unknown  origin  could  be 
easily  carried  out  in  a  routine  high  throughput  manner  (Figures  1  and  2).  We 
believe  that  the  TEGD  technique  should  sufficiently  address  the  dilemma  of 
efficient  simultaneous  expression  pattern  analysis  of  related  genes  with 
relatively  minute  amounts  of  samples  in  clinical  and  investigational  research 
settings.  The  development  of  an  algorithm  to  predict  the  suitability  of  the 
applications  of  TEGD  based  on  the  presence  of  two  distinct  homologous 
regions  separated  by  an  intervening  variable  region  that  would  enable  the 
establishment  of  signature  banding  patterns  from  the  available  sequences  of 
already  identified  genes  or  ESTs  is  in  progress.  We  believe  that  TEGD  has 
the  potential  to  advance  the  ability  to  probe  gene  families  for  genetic  and 
epigenetic  defects  to  a  new  level  of  sophistication  and  will  find  general  use  in 
the  future.  The  application  of  the  TEGD  technique  to  simultaneously  analyze 
multiple  members  of  the  Smad  family  of  genes  has  not  only  validated  the 
enormous  advantage  of  the  technique  as  an  initial  diagnostic  tool  but  also 
illustrates  an  efficient  way  to  identify  novel  genes  that  are  closely  related  at 
the  level  of  their  nucleotide  sequence,  to  identify  splice  variants  of  a  gene  as 
well  as  to  detect  their  altered  expression  patterns. 
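The suitability prediction mentioned above is described only in outline, so the following is a hypothetical sketch of one way such a check could work and not the algorithm under development. It assumes each family member carries the same exact forward and reverse conserved sites on the sense strand (real TEGD primers are degenerate, and the reverse primer anneals to the opposite strand). It predicts each amplicon length and flags members whose products would be too close in size to resolve. All names, sequences and the resolution threshold are invented.

```python
from itertools import combinations


def predicted_amplicon(cdna, fwd_site, rev_site):
    """Length from the start of `fwd_site` to the end of `rev_site`,
    or None if either conserved site is absent or out of order."""
    start = cdna.find(fwd_site)
    if start == -1:
        return None
    end = cdna.find(rev_site, start + len(fwd_site))
    if end == -1:
        return None
    return end + len(rev_site) - start


def tegd_suitability(family, fwd_site, rev_site, min_gap_bp=3):
    """Predict band sizes for a gene family and list pairs that would co-migrate."""
    sizes = {name: predicted_amplicon(seq, fwd_site, rev_site)
             for name, seq in family.items()}
    clashes = [(a, b) for a, b in combinations(sorted(sizes), 2)
               if sizes[a] is not None and sizes[b] is not None
               and abs(sizes[a] - sizes[b]) < min_gap_bp]
    return sizes, clashes


# Toy family: conserved sites "ATGCC" and "TTGCA" flank linkers of varying length.
family = {
    "geneA": "GG" + "ATGCC" + "A" * 10 + "TTGCA" + "CC",
    "geneB": "ATGCC" + "A" * 25 + "TTGCA",
    "geneC": "ATGCC" + "A" * 11 + "TTGCA",
}
sizes, clashes = tegd_suitability(family, "ATGCC", "TTGCA")
print(sizes)
print(clashes)
```

Members reported as clashing would need sequencing or gene-specific primers to be told apart, in the same way that Smads sharing a band were resolved in this study.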

Our  survey  of  the  various  Smad  genes  using  the  novel  TEGD  technique 
described  in  this  article  enabled  us  to  obtain  the  first  clues  in  identifying  the 
Smad8  gene  as  an  important  target  for  loss  of  expression  in  multiple  types  of 
cancers,  including  nearly  31%  of  breast  and  colon  cancers.  This  level  of 
alteration is even more frequent than that of the Smad4 gene, the most
frequent target for genetic inactivation of the known Smad signaling genes in
colon  cancer,  and  is  also  more  frequent  than  the  HER/neu  gene  amplification, 
the  most  celebrated  tumor  marker  for  breast  cancer,  which  occurs  in  about 
20%-30%  of  breast  cancer  cases  (29). 

Thus, the data presented in this article provide the first direct evidence
that  silencing  of  gene  expression  via  DNA  hypermethylation  of  the  Smad8 
gene  could  be  an  important  event  in  tumorigenesis  of  several  cancers 
including  one  third  of  breast  and  colon  cancers.  It  is  interesting  to  note  that 
Smad8  is  apparently  the  major  target  for  loss  of  function  among  the  Smad 
genes in breast cancer and is an R-Smad which becomes phosphorylated
during BMP signaling events and modulates BMP-responsive genes including
those that may affect bone homeostasis (30-34; Figure 5). Additionally, Smad
signaling events via the BMP cytokines are also implicated in other signaling
events that regulate biological processes, including cell differentiation,
proliferation, determination of cell fate during embryogenesis, cell adhesion,
cell  death,  angiogenesis,  metastasis  and  immunosuppression  (1,  2;  Figure  5). 
Although  it  is  intriguing  that  metastasis  to  bone  is  often  associated  with 
advanced  stage  breast  and  other  cancers,  further  studies  would  be  required 
to  understand  whether  metastatic  breast  cancer  cells  defective  in  Smad8 
signaling  could  be  responsible  for  causing  an  imbalance  in  normal  bone 
homeostasis  by  enhancing  osteoclastic  bone  resorption,  leading  to  osteolytic 
lesions  within  the  bone  (35-38). 

Additionally,  despite  the  fact  that  inactivation  of  the  Smad2  and  Smad4 
genes due to intragenic mutations and homozygous deletions has been
reported in nearly 20% of colorectal cancers, evidence for genetic or
epigenetic inactivation of other Smad gene targets at significant levels had
remained elusive until this report (2, 20). The loss of expression of Smad8 in
nearly 31% of colon cancers is more frequent than any other Smad
alteration known to date. Determination of whether the affected cells play a
critical  role  in  tumorigenesis  by  a  mechanism  similar  to  that  in  breast  cancer 
requires  further  study.  Interestingly,  the  presence  of  germline  mutations  in  the 
BMP receptor 1A in juvenile polyposis, which increase the risk of developing
gastrointestinal cancers, suggests that inactivation of BMP signaling may play
a critical role in colon cancer (39, 40). Despite the fact that the elucidation of
BMP-mediated  signaling  pathways  in  which  Smad8  is  a  critical  mediator  is  still 
in  its  infancy,  these  studies  clearly  provide  the  incentive  for  further 
investigations  that  may  help  gain  a  better  understanding  of  the  effects  of 
Smad8  inactivation  in  cancer  and  could  pave  the  way  for  the  exploration  of  its 
potential  utility  in  diagnosis,  prognosis  and  designing  of  therapeutic 
modalities.

Table 1. Altered expression of Smad genes in cancers.

Cancer         Total # of samples    Samples with loss of Smad8 expression (%)
Breast         35                    11/35 (31)
Colon          41                    13/41 (31)
Esophagus      4                     0/4 (0)
Head & Neck    4                     2/4 (50)
Lung           19                    1/19 (5)
Pancreas       3                     2/3 (65)
Prostate       4                     3/4 (75)
Ovary          2                     1/2 (50)
Stomach        4                     2/4 (50)
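The frequencies in Table 1 can be recomputed from the listed counts. The sketch below does so to one decimal place. The dictionary simply restates the counts from the table, and the function name is hypothetical. The printed table gives whole numbers, so small differences in rows with few samples may reflect rounding in the original.

```python
# (samples with loss of Smad8 expression, total samples), as listed in Table 1.
TABLE_1 = {
    "Breast": (11, 35),
    "Colon": (13, 41),
    "Esophagus": (0, 4),
    "Head & Neck": (2, 4),
    "Lung": (1, 19),
    "Pancreas": (2, 3),
    "Prostate": (3, 4),
    "Ovary": (1, 2),
    "Stomach": (2, 4),
}


def percent_loss(lost, total):
    """Percentage of samples with loss of expression, to one decimal place."""
    return round(100 * lost / total, 1)


for cancer, (lost, total) in TABLE_1.items():
    print(f"{cancer:12s} {lost}/{total} ({percent_loss(lost, total)}%)")
```

With sample sizes this small for most tumor types, only the breast and colon rows support a frequency estimate. The remaining rows are best read as preliminary counts.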

Figure legends:

Figure 1. Targeted expressed gene display (TEGD) and tissue-wide expression
of  Smad  genes. 

A.  Schematic  representation  of  TEGD  for  the  Smad  family  of  genes. 

MH1  and  MH2  indicate  highly  homologous  regions  in  the  amino  acid  as  well  as 
DNA  sequence  among  the  various  Smad  gene  family  members.  The  forward 
and  reverse  primers  for  PCR  amplification  of  the  cDNA  were  designed  in  the 
conserved  regions  as  indicated.  The  radiolabeled  PCR  products  were  analyzed 
by  denaturing  acrylamide  gel  electrophoresis.  A  typical  signature  banding 
pattern  (SBP)  of  the  various  Smads  is  indicated  in  the  lower  panel. 

B. TEGD  analysis  of  the  Smad  family  of  genes  in  various  tissue  types. 

PCR  products  for  Smads  using  degenerate  primers  were  analyzed  by  TEGD. 
Lanes  1-17  correspond  to  PCR  products  generated  using  cDNA  templates 
from  brain,  lung,  stomach,  heart,  liver,  spleen,  kidney,  colon,  bone  marrow, 
small  intestine,  trachea,  prostate,  uterus,  thymus,  testis,  skeletal  muscle,  and 
mammary  gland,  respectively.  The  lines  on  the  right  hand  panel  point  to 
distinct  PCR  products.  The  approximate  size  of  PCR  products  in  base  pairs 
(bp)  is  indicated  on  the  left  panel.  The  positions  of  various  Smad  genes  and 
their  variants  as  identified  from  sequence  analysis  are  indicated  on  the  right 
panel. 

C.  RT-PCR  analysis  of  Smad  genes. 

Semi-quantitative  RT-PCR  analysis  of  the  indicated  Smad  genes  was  carried 
out  as  described  under  materials  and  methods.  The  cDNA  template  was 
derived  from  total  RNA  from  normal  tissues  of  brain,  lung,  heart,  liver,  bone 
marrow, kidney, spleen, thymus, prostate, testis, uterus, small intestine,
mammary  gland,  skeletal  muscle,  stomach  and  colon,  lanes  1-16, 
respectively. 


Figure  2.  Analysis  of  Smad  expression  in  cancer. 

A.  TEGD  analysis  of  Smad  genes  in  cancer. 

PCR  products  of  Smads  generated  using  degenerate  primers  as  described 
under Figure 1 were obtained from different cancers and analyzed by
TEGD. The cDNA templates used in reactions analyzed on lanes NC, NB &
NS are from normal cells from colon, breast and stomach tissues; C1-7, B1-7
and S1-4 are from colon, breast and gastric cancers, respectively. The arrows
point  to  distinct  PCR  products  that  were  absent  compared  to  the  normal 
control.  The  positions  of  various  Smad  genes  and  their  variants  as  identified 
from  sequence  analysis  are  indicated  on  the  right  panel. 

B.  Smad8  expression  in  cancer  cell  lines  and  tumors. 

Total  RNA  prepared  from  cell  lines  and  tumors  from  the  lung,  breast  and 
colon  cancers  were  analyzed  by  RT-PCR  (Lanes  1-14).  Lane  1  in  each  of  the 
different  RT-PCR  panels  corresponds  to  the  normal  sample  of  the  indicated 
tissue type. Smad8α, Smad8β and Smad8γ are three of the major
differentially spliced forms of Smad8, which correspond to transcripts that are
full-length, that exhibit deletion of exon 2, and that exhibit deletions of exons 2
and 3, respectively. Analysis of the β-Actin gene is used for normalization and
quantitation  of  the  expression  of  Smad8. 

Figure 3. Epigenetic gene silencing of the Smad8 gene due to altered DNA
methylation. 

A.  Schematic  drawing  of  the  landscape  of  CpG  island  methylation  patterns  in 
the  region  of  genomic  DNA  from  upstream  of  exon  1  through  exon  2  of  the 
Smad8  gene.  Boxes  denote  exons.  The  flag  represents  the  ATG 
corresponding  to  the  first  methionine  of  the  predicted  peptide.  Vertical  lines 
indicate  CpG  islands  in  the  DNA  sequence.  Open  circles  represent 
unmethylated  cytosines  whereas  filled  circles  represent  methylated  cytosines 
as  determined  by  bisulfite  sequencing.  The  circles  above  the  horizontal  line 
indicate  the  methylation  pattern  observed  in  the  CpG  islands  of  the  cell  lines 
that  express  Smad8.  The  circles  below  the  line  indicate  the  methylation 
pattern  of  the  CpG  islands  from  samples,  which  lacked  Smad8 
expression.  The  nucleotide  sequence  of  the  DNA  within  the  dotted  lines  is 
shown  with  the  asterisks  (*)  indicating  CpG  islands. 

B.  Bisulfite  sequence  analysis  of  the  indicated  CpG  islands  of  intron  1  of  the 
Smad8  gene  in  the  cell  lines  that  are  either  proficient  (++)  or  deficient  (-)  in 
Smad8  expression.  Cell  lines  proficient  for  Smad8  expression  have  no  bands 
in  the  C  lane  indicative  of  conversion  of  unmethylated  cytosines  to  uracil  upon 
bisulfite  treatment. 

C.  MSP  (Methylation  Specific  PCR)  analysis  of  the  various  cancers  that  have 
lost  or  retained  Smad8  expression.  The  MSP  products  in  lanes  U  and  lanes  M 
indicate  the  presence  of  unmethylated  and  methylated  templates, 
respectively.  Placental  DNA  (PDNA)  and  in  vitro  methylated  DNA  (IVM)  serve 
as  negative  and  positive  controls. 

Figure  4.  The  effects  of  DNA  demethylation  and  inhibition  of  histone 
deacetylases  on  SMAD8  gene  expression. 

The  indicated  cell  lines  were  treated  with  5-AZA-dC  for  7  days  or  with  TSA  for 
24hrs.  To  assess  the  effect  of  both  5-AZA-dC  and  TSA  simultaneously,  cells 
were  exposed  sequentially  for  7  days  to  5-AZA-dC  and  subsequently  to  TSA 
for  an  additional  24  hrs.  Total  RNA  and  genomic  DNA  were  isolated  and 
Smad8  expression  and  DNA  hypermethylation  were  determined  by  (A)  RT- 
PCR  and  (B)  MSP  analysis,  respectively.  MDAMB231  cells  were  used  as  the 
positive  control. 

Figure  5.  A  model  for  the  Smad8  connection  to  cancer. 

BMP signaling is initiated by the association between BMPs and type I (RI)
and type II (RII) heteromeric receptors, which is followed by phosphorylation of the
type I receptor (RI) kinase that in turn phosphorylates the receptor-regulated
Smads (R-Smad), such as Smad8, and initiates the signaling events. The
phosphorylated Smad8 forms a heteromeric complex with the common-mediator
Smad (Co-Smad), Smad4, and is translocated into the nucleus. In
the nucleus, the Smad8/Smad4 hetero-oligomer, either by itself or by
associating with heterologous Smad-interacting DNA binding proteins
(SIDBP) or other cofactors, could mediate specific transcriptional activation or
repression responses. The inhibitory Smads (I-Smad) such as Smad6 and
Smad7 are able to compete with the R-Smads by stably binding the RI kinase
or  by  preventing  association  of  R-Smads  with  the  Co-Smad,  effectively 
blocking  the  signaling  cascade.  There  are  numerous  other  signaling  pathways 
such  as  the  Ras-MEK  pathway  that  could  also  modulate  the  end  effects  by 
establishing  cross  talk  among  the  different  pathway  members.  BMP  signaling 
is  implicated  in  tumor  suppression,  bone  homeostasis,  angiogenesis  and 
metastasis. 

Figure 2A. Analysis of Smad gene expression in cancer.

Abstract: The epigenome is defined by DNA methylation patterns and the
associated posttranslational modifications of histones. This histone code
determines the expression status of individual genes dependent upon their
localization  on  the  chromatin.  The  silencing  of  gene  expression  is  associated 
with  deacetylated  histones,  which  are  often  found  to  be  associated  with  regions 
of DNA methylation as well as methylation at the lysine 9 residue of histone 3.
In contrast, the activation of gene expression is associated with acetylated
histones and methylation at the lysine 4 residue of histone 3. The histone
deacetylases play a major role in keeping the balance between the acetylated and
deacetylated  states  of  chromatin.  Histone  deacetylases  (HDACs)  are  divided 
into  three  classes:  class  I  HDACs  (HDACs  1, 2, 3,  and  8)  are  similar  to  the  yeast 
RPD3  protein  and  localize  to  the  nucleus;  class  II  HDACs  (HDACs  4,  5,  6,  7,  9, 
and  10)  are  homologous  to  the  yeast  HDA1  protein  and  are  found  in  both  the 
nucleus  and  cytoplasm;  and  class  III  HDACs  form  a  structurally  distinct  class 
of  NAD-dependent  enzymes  that  are  similar  to  the  yeast  SIR2  proteins.  Since 
inappropriate  silencing  of  critical  genes  can  result  in  one  or  both  hits  of  tumor 
suppressor  gene  (TSG)  inactivation  in  cancer,  theoretically  the  reactivation  of 
affected  TSGs  could  have  an  enormous  therapeutic  value  in  preventing  and 
treating cancer. Indeed, several HDAC inhibitors are currently being
developed and tested for their potency in cancer chemotherapy. Importantly, these
agents  are  also  potentially  applicable  to  chemoprevention  if  their  toxicity  can 
be  minimized.  Despite  the  toxic  side  effects  and  lack  of  specificity  of  some  of  the 
inhibitors,  progress  is  being  made.  With  the  elucidation  of  the  structures,  func¬ 
tions  and  modes  of  action  of  HDACs,  finding  agents  that  may  be  targeted  to 
specific  HDACs  and  potentially  reactivate  expression  of  only  a  defined  set  of  af¬ 
fected  genes  in  cancer  will  be  more  attainable. 

Keywords:  histone  deacetylases  (HDAC);  histone  code;  active  histone  code 
(AHC);  silenced  histone  code  (SHC);  histone  deacetylase  inhibitor  (HDACi); 
cancer  therapy 
INTRODUCTION

The impending completion of the Human Genome Project has led the scientific
community to the cusp of identifying every gene within the DNA of our genome.
However, many challenges still lie ahead, foremost of which may be deciphering the
regulatory cues and mechanisms that allow these genes to be "turned on" or "off"
depending upon the intra- and extracellular signals a cell receives. Eukaryotes have
evolved a complex packaging of DNA that encumbers transcription, which requires
accessible DNA to allow transcription factors and RNA polymerase to bind to
promoters. Genomic DNA is packaged into highly ordered structures known as
chromatin, which is composed of structural subunits called nucleosomes. Each
nucleosome consists of 146 base pairs of DNA, the equivalent of two superhelical
turns, wrapped around an octamer of core histone proteins. The histones have
numerous sites where posttranslational modifications can occur, and it has been
proposed that the pattern of modifications acts as an information code that
regulates processes that influence gene transcription. This pattern of
modifications has been termed the histone code.1,2 The particular pattern of histone
modifications may play a role in determining the affinity for chromatin-associated
proteins, which determine whether the chromatin takes on an active or silent state.
DNA methylation and histone modification are the major contributors to chromatin
modification, which, combined with ATP-dependent chromatin remodeling, is the
principal epigenetic mechanism by which tissue-specific gene expression patterns
and global gene silencing are established and maintained.3

This review will discuss one of the important chromatin-modifying effects, histone
deacetylation, a process that is correlated with repression of gene expression. It
is common knowledge that orderly expression of appropriate genes at optimal levels
is central to the maintenance of the destined differentiated status of all the cells
that make up the human body, while any alterations or inappropriate levels of gene
expression may lead to cancer.4 Therefore, we will also discuss the implications of
the use of agents that influence gene expression by affecting histone acetylation as
promising chemopreventive or chemotherapeutic agents in cancer.


THE  HISTONE  CODE 

The basic building block of chromatin is the nucleosome, which consists of 146
base pairs of DNA wrapped around an octamer of histones represented by two copies
each of histone H2A, H2B, H3, and H4 (refs. 5,6). Histones are basic proteins that
consist of a globular domain and an N-terminal tail that protrudes from the
nucleosome. Although the histones are some of the most evolutionarily conserved
proteins, they are also among the most variable in terms of posttranslational
modifications. The histone tails emanating from the nucleosome are unstructured and
serve as targets for characteristic covalent posttranslational modifications,
including acetylation, phosphorylation, methylation, sumoylation, and
ubiquitination. These posttranslational modifications determine the structure and
pattern of chromatin condensation and determine the histone code involved in gene
regulation.7 Cytogenetic analysis of chromatin identified euchromatin and
heterochromatin, where the heterochromatin is the portion that remains deeply
stained (heteropyknotic) and highly condensed during



ANNALS  NEW  YORK  ACADEMY  OF  SCIENCES 


cell division. The heterochromatin region is generally rich in repetitive DNA
sequences and very low in gene density. However, the extent of heterochromatin of
specific regions may differ in different individuals or tissue types and may be
determined by a complex process involving factors responsible for chromatin
remodeling. The outcome of the chromatin remodeling process, the histone code,
apparently determines a mechanistic basis not only for the spreading of
heterochromatin but also for the epigenetic inheritance of the silent states of
specific regions of chromatin. The location of a specific gene on the chromatin may
eventually determine whether the gene is either expressed or silenced. This impact
of gene location is known as position effect variegation (PEV). Heterochromatization
of a formerly euchromatic region at its boundaries may have an enormous impact on
the status of gene expression.


HUMAN  HISTONE  DEACETYLASES 

In the mid-1960s, Allfrey and his colleagues were the first to observe histone
acetylation and postulated that acetylation of core histones could regulate
transcription.8 Histone hyperacetylation correlated with increased transcription and
hypoacetylation with repression. However, it was not until the early 1990s that the
role of HDACs in this regulation came to prominence. The initial observations that
implicated a role for HDACs in transcriptional regulation came from a screen to
identify small molecules that could return spindle-like transformed NIH3T3 cells to
the normal fibroblast-like morphology. An epoxyketone-containing cyclic tetrapeptide,
trapoxin, was identified without the knowledge of what proteins this molecule was
acting on.9 Later, it was discovered that cells treated with trapoxin had
hyperacetylated histones and that this molecule inhibited histone deacetylation.10
It was not until 1996, however, that the protein target for trapoxin was identified
with the cloning of the first histone deacetylase.11 To date, 18 HDACs have been
identified in humans, and their activities have been implicated in
transcription,12-17 cell cycle progression,18-21 gene silencing,22
differentiation,23,24 DNA replication,19,25-28 and the DNA damage response.29-32 One
question that often arises is why humans need so many HDACs and what role each of
them plays. Clues are starting to emerge, as well as many more questions. Grunstein
and colleagues recently used microarray deacetylation maps in yeast to determine the
genome-wide functions of yeast deacetylases. They showed that Rpd3 and Hda1 act
predominantly on distinct promoters and gene classes and are recruited by novel
mechanisms. Hda1 also deacetylates subtelomeric domains, which contain genes
involved in gluconeogenesis, growth on nonglucose carbon sources, and adverse growth
conditions. Sir2 was shown to deacetylate subtelomeric heterochromatin, while
Hos1/Hos3 and Hos2 regulate ribosomal DNA and ribosomal protein genes.33 Researchers
have set out to delineate human HDAC-mediated events, and the one clear observation
is that HDAC function is very complex. The number of HDACs, splice variants of these
HDACs, proteins that associate with HDACs either alone or in multiprotein complexes,
and posttranslational modifications such as phosphorylation and sumoylation all play
a role in regulating the specificity of HDAC activity.

The HDACs can be separated into three classes based on their homology to yeast
histone deacetylases (Table 1). Class I HDACs have high homology to the yeast
RPD3 gene, whereas class II HDACs are homologous to the Hda1 gene. A third family,
class III HDACs, was identified based on similarity to the Sir2 gene.

TABLE 1. Human histone deacetylase

Histone               Amino   Sensitivity   Chromosomal
deacetylase           acids   to TSA(a)     location                  Reference

Class I
HDAC1                 482     yes           1p34.1                    11, 131, 132
HDAC2                 488     yes           6q21                      131, 133
HDAC3                 428     yes           5q31.1-5q31.3             34-36, 131
HDAC8                 377     yes           Xq21.2-Xq21.3 or Xq13     37-39

Class II
HDAC4                 1084    yes           2q37                      59, 68, 134
HDAC5                 1122    yes           17q21                     59, 60, 134
HDAC6                 1215    yes           Xp11.23                   59, 68, 134
HDAC7                 952     yes           12q13.1                   60
HDAC9                 1011    yes           7p15-p21                  60, 61, 68, 134
HDAC10                669     yes           22q13.31-13.33            62, 135-137
HDAC11(b)             347     yes           3p25.1                    70

Class III (sirtuins)
SIRT1                 747     no            10q22.2                   73
SIRT2                 389     no            19q13                     73, 76
SIRT3                 399     no            11p15.5                   73
SIRT4                 314     no            12q                       73
SIRT5                 310     no            6p22.3                    73
SIRT6                 355     no            19p13.3                   138
SIRT7                 400     no            17q                       138

(a) TSA, trichostatin A.
(b) HDAC11 has properties of both class I and class II HDACs.

Class I HDACs

The class I HDACs, HDAC1, HDAC2, HDAC3, and HDAC8 (refs. 34-39), all share a certain
degree of homology to the yeast RPD3 gene, are around 400-500 amino acids long,
generally localize to the nucleus, and are ubiquitously expressed in many human
cell lines and tissues. All four members have a deacetylase catalytic domain, and
HDAC1 and HDAC2 have a C terminal RB binding motif adjacent to a basic region. Each
class I HDAC has been mapped to a chromosomal location (Table 1); interestingly,
HDAC8 was shown to localize to Xq13 by FISH using the HDAC8 cDNA as the probe,39
whereas another group using radiation hybrid mapping reported the location at
Xq21.2-Xq21.3 (ref. 38), raising the possibility that a gene duplication event may
have occurred. All four members have been shown to be sensitive to histone
deacetylase-specific inhibitors. Interestingly, the messenger RNAs of all but
HDAC8 are upregulated in response to trichostatin A treatment, suggesting that
HDAC inhibitors (HDACi) may trigger an autoregulatory loop that results in a
compensatory feedback pathway.40

It is now becoming clear that these HDACs are parts of large protein complexes
in vivo that direct gene-specific regulation of transcription, hormone signaling,
the cell cycle, differentiation, and DNA repair. Class I HDACs have been shown to
associate with the silencing mediator for the retinoid and thyroid hormone receptor
complex (SMRT),41 the CoREST complex, as well as the Sin3 and Mi-2/NuRD corepressor
complexes.42,43 HDAC1 and 2 are part of the core complex along with RbAp46/48. The
Sin3 complex consists of this core complex in addition to SAP18 and 30, which aid in
stabilizing the protein interactions; and mSin3A, which serves as the scaffold for
the assembly of the complex.44 The NuRD complex contains the core complex along with
MTA2, CHD3, and CHD4, all of which contain DNA helicase/ATPase domains.45 HDAC1 and
2 are found in the CoREST complex, but unlike the other complexes, neither RbAp46
nor RbAp48 is present. The remaining components are proteins homologous to MTA1 and
2, called CoREST and p110, respectively.46 Members of the class I HDACs have also
been found in association with Rb,27,47 DNA methyltransferase 1 (refs. 48,49),
TGIF/Smads,50 glucocorticoid receptor,51 and Sp1 (ref. 52). Recently, HDAC3 was
shown to form a complex with N-CoR (nuclear receptor corepressor),53,54 and this
corepressor complex inhibits JNK activation through an integral subunit, GPS2
(ref. 55).

Recent work has implicated posttranslational modifications of HDAC in regulating
HDAC activity and association potential. Galasinski et al. have shown that
phosphorylated HDAC1 and 2 had a small increase in activity relative to that
observed in the nonphosphorylated HDACs and that this increase was reversed upon
phosphatase treatment.56 These investigators went on to show that phosphorylation
disrupted HDAC1 and 2 complex formation as well as the interaction between HDAC1
and mSin3 and YY1 but not RbAp46/48. Though HDAC1 has been shown to be
phosphorylated by CK2, cAMP-dependent protein kinase, and protein kinase G in vitro,
HDAC2 is uniquely phosphorylated by CK2. This HDAC2 phosphorylation promotes
enzymatic activity and regulates complex formation, but has no effect on
transcriptional repression.57 David, Neptune, and DePinho have proposed another
mechanism of regulation. They demonstrated that HDAC1 is a substrate for SUMO-1
(small ubiquitin-related modifier) modification and that mutations in the target
residues reduced transcriptional repression without affecting the ability of HDAC1
to associate with mSin3 (ref. 58). These observations suggest that SUMO-1
modification regulates the biological effects of HDAC1 by potentiating its histone
deacetylase activity.


Class  II HDACs 

Once the novel yeast deacetylase Hda1 was characterized, several groups
simultaneously isolated some of the human homologues using database searches. From
this, HDAC4, 5, 6, and 7 were identified.59,60 Subsequently, HDAC9 (ref. 61) and
HDAC10 (ref. 62) were isolated and assigned to class II. These HDACs are twice as
large (~1,000 amino acids) as the class I family members, and most have a COOH
terminus catalytic domain, except for HDAC6, which has a second catalytic domain in
the NH2 terminus. HDAC10 has an NH2 terminus catalytic domain and a COOH terminus
pseudorepeat that shares homology with the catalytic domain. Class II HDACs are
also sensitive to HDACi; but, unlike class I HDACs, class II HDACs are cytoplasmic
and are shuttled to the nucleus as they are needed. HDAC10 is an exception, as it
has been shown to be a nuclear protein. Class II HDACs are also differentially
expressed in human tissue, with the highest levels being found in the heart, brain,
and skeletal muscle.24,43

Class II HDACs have also been shown to be a part of larger multiprotein complexes.
HDAC4 and 5 associate with HDAC3 (ref. 59) and form a complex with N-CoR and
SMRT.63 The association with HDAC3 has been shown to be regulated by 14-3-3.
Interaction of HDAC4 or 5 with 14-3-3 proteins sequesters the protein in the
cytoplasm. When this interaction is lost, HDAC4 and 5 translocate to the nucleus
and associate with HDAC3 and repress gene expression.64 A similar mechanism has been
proposed for the regulation of the importin-α-HDAC4 association by 14-3-3 (ref. 65).
A recent study demonstrated that the catalytic domain of HDAC4 interacts with HDAC3
through N-CoR/SMRT. The authors of this study suggest that class II HDACs regulate
transcription by bridging the SMRT/N-CoR-HDAC3 complex and select transcription
factors independently of HDAC activity.66 The recently identified HDAC10 also
interacts with SMRT as well as with HDAC2 (ref. 62).

A common NH2 terminal extension in HDAC4, 5, and 7 allows them to interact with
the MEF2 family of transcription factors once they translocate from the cytoplasm
to the nucleus. These interactions play an important role in activating
muscle-specific genes and differentiation in both smooth and skeletal muscle.67,68
Class II HDACs have also been reported to interact with the COOH terminal binding
protein (CtBP) and repress MEF2-mediated transcription.69

The 11th member of the HDAC family was recently cloned and characterized;
interestingly, it has properties seen in both classes of HDACs.70 The protein is 347
amino acids long, with homology in the core catalytic domains to both class I and
class II HDACs. The size of the protein is in line with class I HDACs, but HDAC11
is differentially expressed in the heart, brain, skeletal muscle, and kidney, which
is typical of class II HDACs. The protein is predominantly nuclear, and like its
family members, HDAC11 is sensitive to HDACi. HDAC11 associated with complexes that
contained HDAC6 (ref. 70), which has recently been shown to function as a tubulin
deacetylase.71


Class  III  HDACs 

The third family of histone deacetylases, sirtuins, are homologues of the yeast
Sir2 gene, which has been implicated in chromatin silencing, cellular metabolism,
and aging.72 There are seven sirtuins in humans, SIRT1-7, most of which average
around 300-400 amino acids, except for SIRT1, which has 747 (Table 1). The catalytic
domains average 275 amino acids and contain two CXXC motifs that function as zinc
finger domains73 and at least one hydrophobic region that potentially functions as a
leucine zipper.74 The histone deacetylase activity of these enzymes is dependent
upon NAD+,75 and the yeast Sir2 has intrinsic ADP-ribosyltransferase activity.73
Mutational analysis indicated that Gly 270 and Asn 345 are critical amino acids
whereby deacetylase activity was abolished in the 345 mutant and diminished in the
270 mutant.75 ADP-ribosyltransferase activity was also abolished in the 345 mutant
and severely decreased in the 270 mutant.75 Immunofluorescence studies have
demonstrated that unlike the yeast Sir2, human Sir2 does not localize in the
nucleus.76 Though the field of human sirtuins is in its infancy, some interesting
developments are starting to emerge, the foremost being the association of SIRT1
with p53. SIRT1 has been shown to specifically associate with and deacetylate p53,
thereby repressing p53-mediated transcriptional activation, which prevents growth
inhibition or apoptosis in response to DNA damage.30,31,77-79 These findings could
have a tremendous impact on p53-based cancer therapy, as inhibitors of SIRT1 could
be used in combination with current therapeutic protocols to enhance efficacy.


THE  EPIGENOME  AND  ACETYLATION 

Epigenetic changes of the genome include DNA methylation and modifications of
histones. In humans, DNA cytosine methyltransferases (Dnmt1, Dnmt3a, Dnmt3b) usually
add a methyl group to the 5-carbon of a cytosine located next to a guanine
(5'-CpG-3'). These CpG sequences are found in islands mainly in the 5'-regions such
as the promoter, first exon, and sometimes in the first intron of housekeeping genes
as well as tissue-specific genes. Although most CpG islands are unmethylated in
normal cells, they could become methylated during development, differentiation, or
cancer and play a part in gene regulation. Among the histone modifications,
acetylation of core histone tails has been shown to be dependent on the opposing
activities of two types of enzymes, histone acetyltransferases (HATs) and histone
deacetylases (HDACs). HATs acetylate the ε-amino groups of the lysine residues of
the histone tails, and removal of these acetyl groups by HDACs restores the positive
charge on the residues. Actively transcribed regions of the chromatin are generally
enriched with highly acetylated histones H3 and H4 in euchromatic regions of the
genome. Methylation of histones by proteins bearing the SET (Su[var], Enhancer of
zeste, trithorax) domain also targets lysine residues. Distinct methyltransferases
(H3-K4 methyltransferase and H3-K9 methyltransferases such as Suv39h1, Suv39h2, G9a,
ESET/SetDB1 and Eu-HMTase) methylate histone H3 either at lysine 4 (H3-meK4),
lysine 9 (H3-meK9), or other lysine residues. The regions of chromatin with H3-meK4
modifications usually harbor lysine 9 modified by acetylation (H3-AcK9), marking
active euchromatin, while the presence of H3-meK9 is correlated with condensed
heterochromatin.80,81 The chromodomain of HP1 (heterochromatin protein 1) binds to
H3-meK9 with high affinity and is involved in heterochromatin assembly through the
oligomerization of the HP1 proteins.82,83 Furthermore, H3-meK4 inhibits the binding
of the nucleosome remodeling deacetylase (NuRD) repressor complex to H3 histone
tails to ensure disruption of the silencing process by protein-protein interactions,
thereby resulting in expression of otherwise silent genes.

The multisubunit complex with ATPase activity known as SWI/SNF in human cells
consists of either of the two ATPase subunits, BRG1 (a human homologue of the yeast
Swi2/Snf2) or hBRM. The human SWI/SNF complexes play a major role in chromatin
remodeling and are not only enriched in active chromatin but also present and found
to form complexes with corepressors such as Sin3, HDAC1, HDAC2, HDAC3, N-CoR, and
KAP1 (krab-associated protein 1). These observations suggest that SWI/SNF plays
important roles in both regulation of transcription and gene repression.84 Prominent
examples illustrating the differential effects of SWI/SNF are the interaction of the
retinoblastoma (Rb) tumor suppressor protein (1) with both BRG1 and hBRM to form a
hSWI/SNF repressor complex regulating expression of cyclins and cyclin-dependent
kinases (cdks) during S phase and (2) with HDACs to repress certain genes such as
cyclin E during the G1 phase.85-89 Interestingly, studies with the filamentous fungus
Neurospora crassa showed that mutation of H3-lys9 resulted in a loss of DNA
methylation in vivo, suggesting that H3-lys9 methylation could be coupled to DNA
methylation in other organisms and that a similar mechanism may play a role in
silencing chromatin in mammals.

[Figure 1 panel labels: Histone coding for silencing; Histone coding for
transcription; Euchromatin; Active histone code (AHC)]

FIGURE 1. The role of HDACs in the histone code. The molecular details of the
various modifications of the epigenome in relation to the heterochromatin and
euchromatin can be found in the text. Abbreviations: HAT, histone acetyltransferase;
HDAC, histone deacetylase; MBP, methyl binding protein; MeCP, methyl-CpG binding
protein; H3-meK4, histone H3 methylated at lysine 4; H3-meK9, histone H3 methylated
at lysine 9; H3-AcK9, histone H3 acetylated at lysine 9; DNMT, DNA
methyltransferase; SWI/SNF, chromatin remodeling multiprotein complex with ATPase
activity; Me in a triangle denotes DNA CpG methylation; Me in a circle denotes
histone methylation; Ac, histone acetylation; HMT, histone methyltransferase;
p300/CBP, CREB binding protein; DDM, DNA demethylase; HDM, histone demethylase.

These studies lead us to believe that DNA methyltransferases might be taking cues
from the histone code. The two repression mechanisms, DNA methylation and histone
deacetylation, are apparently connected by the methyl-CpG binding proteins (MBPs),
such as MeCP2, MBD1, MBD2, MBD3, MBD4, and Kaiso or DNA methyltransferases. The
MeCP2 protein can interact with histone deacetylases (HDAC1 and HDAC2) via the
corepressor Sin3. On the other hand, MBD2 is also associated with HDAC1, which
interacts with the Sin3 or NuRD complex.44 Other MBPs are also believed to recruit
HDAC activity.44,90 Direct interactions, as well as interactions in a complex,
between the various DNA methyltransferases (Dnmt1, Dnmt3a, and Dnmt3b, as well as a
Dnmt3 family homologue, Dnmt3L) have been demonstrated.49,91,92 These observations
strongly suggest that the distribution of the acetylation of histones in the
chromatin resulting from the complex nature of the epigenome marks the status of
gene expression (Fig. 1).


HDAC  INHIBITORS  AS  CANCER  PREVENTIVE  AND 
THERAPEUTIC  AGENTS 

HDAC inhibitors cause the accumulation of acetylated histones in nucleosomes,
which results in the expression of a specific set of genes that can lead to cell
arrest, differentiation, or apoptosis. Therefore, they have the potential for use in
the chemoprevention and treatment of cancer.49,91-94 Inhibitors of HDACs have been
isolated from natural sources as well as derived from synthetic compounds, as
summarized in Table 2. Many different structural classes of HDAC inhibitors have
been reported, including:

(1) short-chain fatty acids, e.g., sodium n-butyrate (NaBu);95,96

(2) hydroxamic acids, such as trichostatin A (TSA),97 suberoylanilide hydroxamic
acid (SAHA),98-100 and oxamflatin;101

(3) cyclic tetrapeptides containing a 2-amino-8-oxo-9,10-epoxy-decanoyl
(AOE) moiety, e.g., trapoxin;10,102

(4) cyclic tetrapeptides without an AOE moiety, e.g., apicidin (refs. 103,104) and
FR901228 (ref. 105); and

(5) benzamides, e.g., MS-27-275 (refs. 106,107).

Sodium n-butyrate (NaBu) is a nonspecific inhibitor, which has been shown to reduce
the proliferation of many tumor cell lines, enhance differentiation, and stimulate
apoptosis, leading to decreased viability of cells.95,96,108,109 The butyrates are
the only class, to date, that have been approved for clinical use; but they are far
from ideal inhibitors, as they are nonspecific, exerting effects on multiple enzyme
systems, and the dose required to inhibit deacetylation is in the millimolar range.
A number of investigators have shown that sodium butyrate enhances the efficacy of
retinoic acid (RA) in a number of cell lines, including the S91 melanoma line.110
TSA, originally developed as an antifungal agent, is a potent and reversible
inhibitor of histone deacetylase; nanomolar concentrations of it inhibit deacetylase
activity, targeting the cell cycle progression of several cell types, inducing cell
growth arrest at both the G1 and G2/M phases, and in some cases also inducing
apoptosis.18,96,97,111-113 TSA inhibition of HDACs has been shown to alter gene
expression (twofold increase or decrease) in roughly 2% of expressed genes,
suggesting that the action of TSA is selective.114 Similar results were also
observed in transformed cultured cells treated with SAHA.93 SAHA (ref. 100) is a
cell-permeable inhibitor of HDACs that structurally resembles TSA. SAHA has been
shown to induce growth inhibition,99,115 differentiation,100,116 and apoptosis in a
variety of cell types, including ARP-1 multiple myeloma cells, the LNCaP prostate
cancer cell line, and U937 leukemia cells.93 SAHA also induces caspase-dependent
apoptosis and downregulation of daxx in acute promyelocytic leukemia with
t(15;17),117 as well as antiangiogenesis activity by altering VEGF signaling in
HUVEC cells.118 In in vivo studies, the incidence of mammary tumors was reduced by
40% and the mean tumor volume by 78% without any side effects when rats with
methylnitrosourea-induced mammary carcinomas were fed SAHA (900 parts per
million).99 Two other studies also revealed inhibition of tumor growth by SAHA in
mice with lung cancer induced by administration of
4-(methylnitrosoamino)-1-(3-pyridyl)-1-butanone and in nude mice transplanted with
CWR22 androgen-dependent prostate cancer.93

TABLE 2. HDAC inhibitors

                                                          Optimal
Class of inhibitor                      Example           concentration   Reference

Short chain fatty acid                  butyrates         1.5 mM          93, 139
Hydroxamic acids                        trichostatin A    40-70 nM        102
                                        SAHA              2-5 µM          100
                                        oxamflatin        µM range        101
Cyclic tetrapeptides with AOE moiety    trapoxin A        50 nM           10
Cyclic peptides without AOE moiety      FR901228          µM range        105
                                        apicidin          2-4 nM          104
Benzamides                              MS-27-275         2-5 µM          107

Other  HDACi  shown  to  inhibit  tumor  growth  in  animal  models  include  oxamfl- 
atin,  MS-27-275,  and  azeloic  bishydroxamate.  In  many  of  these  cases,  no  toxicity, 
evaluated  by  weight  gain  and  histologic  examination,  was  observed.  Trapoxin  (TPX) 
[cyclo-(L-phenylalanyl-L-phenyIalanyl-D-pipecolinyl-L-2-amino-8-oxo-9,10-ep- 
oxy-decanoyl)]  is  a  fungal  product  that  can  induce  morphological  reversion  of  trans¬ 
formed  NIH3T3  fibroblasts.  Removing  an  epoxide  group  in  trapoxin  completely 
abolished  the  inhibitory  activity,  which  suggests  that  trapoxin  binds  covalently  to  the 
histone  deacetylase  via  the  epoxide  group.  Trichostatin  A  reversibly  inhibits 
HDACs,  whereas  trapoxin  causes  inhibition  by  irreversible  binding  to  the  HDAC. 
However, they have been shown to induce nearly identical biological effects on the
cell cycle and differentiation.102,119 Apicidin
[cyclo(N-O-methyl-L-tryptophanyl-L-isoleucinyl-D-pipecolinyl-L-2-amino-8-oxodecanoyl)]
is a fungal metabolite shown to inhibit both mammalian and protozoan HDACs
(IC50 = 0.2-1.5 nM). Apicidin can lead to a morphological reversal and growth
inhibition of H-ras MCF10A cells similar to that induced by other HDAC
inhibitors.103 The growth inhibition of HeLa cells by apicidin is accompanied by
morphological changes, cell cycle arrest at the G1 phase with increased induction
of p21/WAF1/Cip1 and decreased phosphorylation of the Rb protein, and
accumulation of hyperacetylated histone H4.103,104 In another
study, apicidin was shown to induce apoptosis and Fas/Fas ligand expression in the
human acute promyelocytic leukemia cell line HL60.120 The newly synthesized
benzamide derivative MS-27-275 can induce p21 (WAF1/CIP1) and gelsolin, resulting
in an altered cell cycle distribution.106,107 In some studies, an increase in the
accumulation of acetylated histones H3 and H4 was detected in the TβR II promoter
after treatment with MS-275, and MS-27-275 was able to induce an increase in
TGF-betaRII mRNA to restore TGF-beta signaling.121 HDACi have also been used in
combinational therapy, most frequently with retinoic acids in hematological
cancers.122-127 This area of study is more thoroughly described in a review by
Pandolfi.128

The mechanisms of inhibition of HDACs by these inhibitors are coming to light
with the resolution of the structure of the catalytic core of the HDACs.129 HDACs
have a homologous 390-amino acid catalytic core, and the residues that form the
active site are conserved across all HDACs. An HDAC homologue in Aquifex aeolicus
called HDLP was used in crystallography studies to analyze HDLP-TSA and
HDLP-SAHA complexes. These studies revealed that a tubular pocket, a zinc binding
site, and two aspartate-histidine charge relay systems form the active catalytic site
of HDLP. The hydroxamic acid moieties of TSA and SAHA bind to the zinc in the
tubular pocket, and this interaction is believed to be critical in inhibiting the
enzyme.93,129

Recently,  there  have  been  reports  on  inhibitors  of  sirtuins,  which  in  general  are 
not  inhibited  by  these  types  of  HDACi.  Identifying  and  generating  inhibitors  to  this 
class  of  HDACs  would  expedite  the  dissection  of  their  biological  functions  and,  in 
the  long  term,  could  possibly  be  used  in  combinational  therapy,  especially  in  light  of 
the  interaction  between  SIRT1  and  p53.  Nonhydrolyzable  NAD  analogues  have  been 
used, but they are problematic in that they nonspecifically inhibit other
NAD-dependent enzymes. Small molecules that contain a 2-hydroxyl-1-naphthol moiety have
been  developed  and  have  been  shown  to  inhibit  sirtuins.1^11  These  compounds  may 
be the building blocks of an approach to find specific inhibitors to each of the
sirtuins, allowing the delineation of the role of these proteins in transcriptional
regulation, cell growth, DNA repair, apoptosis, and development.


CONCLUSIONS  AND  FUTURE  PERSPECTIVES 

It is becoming evident that the key to effectively using the information provided
by the Human Genome Project hinges on the accurate interpretation of the histone
code. The roles for HDACs in the histone code and transcriptional regulation are
becoming clearer, but the identification of splice variants of some of the HDACs and
posttranslational modifications to the HDACs shows just how complex the
regulation of these enzymes and the complexes that they are found in can be. The
ever-growing list of HDACi will help to elucidate the roles of these HDACs in mediating
growth arrest, differentiation, and cell death. The identification of HDACi specific
to HDACs involved in these processes through regulation of expression of a defined
set of genes affected in cancer would be of great value in cancer prevention and
therapy and will continue to be a major focus of research in these fields.